Provide fast write emulation path to release shadow lock.
Basically we can consider shadow fault logic into two parts,
with 1st part to cover logistic work like validating guest
page table or fix shadow table, and the 2nd part for write
emulation.
However there's one scenario we can optimize to skip the
1st part. For previous successfully emulated virtual frame,
it's very likely approaching at write emulation logic again
if next adjacent shadow fault is hitting same virtual frame.
It's wasteful to re-walk 1st part which is already covered
by last shadow fault. In this case, actually we can jump to
emulation code early, without any lock acquisition until
final shadow validation for write emulation. By perfc counts
on 64bit SMP HVM guest, 89% of total shadow write emulation
are observed falling into this fast path when doing kernel
build in guest.
Signed-off-by Kevin Tian <kevin.tian@intel.com>